Theoretical Population Biology

Elsevier BV

All preprints, ranked by how well they match Theoretical Population Biology's content profile, based on 47 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Gene genealogies in diploid populations evolving according to sweepstakes reproduction

Eldon, B.

2026-01-15 evolutionary biology 10.64898/2026.01.15.699673 medRxiv
Top 0.1%
28.1%

Recruitment dynamics, or the distribution of the number of offspring among individuals, is central for understanding ecology and evolution. Sweepstakes reproduction (heavy right-tailed offspring number distribution) is central for understanding the ecology and evolution of highly fecund natural populations. Sweepstakes reproduction can induce jumps in type frequencies and multiple mergers in gene genealogies of sampled gene copies. We take sweepstakes reproduction to be a skewed offspring number distribution due to mechanisms not involving natural selection, such as in chance matching of broadcast spawning with favourable environmental conditions. Here, we consider population genetic models of sweepstakes reproduction in diploid panmictic populations absent selfing and evolving in a random environment. Our main results are (i) continuous-time Beta and Poisson-Dirichlet coalescents; when combining the results the skewness parameter α of the Beta-coalescent ranges from 0 to 2, and the Beta-coalescents may be incomplete due to an upper bound on the number of potential offspring produced by any pair of parents; (ii) in large populations time is measured in units proportional to either N/log N or N generations (where 2N is the population size when constant); (iii) it follows that incorporating population size changes leads to time-changed coalescents with the time-change independent of α; (iv) using simulations we show that the ancestral process is not well approximated by the corresponding coalescent (as measured through certain functionals of the processes); (v) whenever the skewness of the offspring number distribution is increased the conditional (conditioned on the population ancestry) and the unconditional ancestral processes are not in good agreement.

2
Gene genealogies in haploid populations evolving according to sweepstakes reproduction

Eldon, B.

2026-01-12 evolutionary biology 10.64898/2026.01.08.698389 medRxiv
Top 0.1%
26.1%

Recruitment dynamics, or the distribution of the number of offspring among individuals, is fundamental to ecology and evolution. We take sweepstakes reproduction to mean a skewed (heavy right-tailed) offspring number distribution without natural selection being involved. Sweepstakes may be generated by chance matching of reproduction with favorable environmental conditions. Gene genealogies generated by sweepstakes reproduction are in the domain of attraction of multiple-merger coalescents where a random number of lineages merges at such times. We consider population genetic models of sweepstakes reproduction for haploid panmictic populations of both constant (N) and varying population size, evolving in a random environment. We construct our models so that we can recover the observed number of new mutations in a given sample without requiring strong assumptions regarding the population size or the mutation rate. Our main results are (i) continuous-time coalescents that are either the Kingman coalescent or specific families of Beta- or Poisson-Dirichlet coalescents; when combining the results the parameter α of the Beta-coalescent ranges from 0 to 2, and the Beta-coalescents may be incomplete due to an upper bound on the number of potential offspring an arbitrary individual may produce; (ii) in large populations we measure time in units proportional to either N/log N or N generations; (iii) incorporating fluctuations in population size leads to time-changed multiple-merger coalescents where the time-change does not depend on α; (iv) using simulations we show that in some cases approximations of functionals of a given coalescent do not match the ones of the ancestral process in the domain of attraction of the given coalescent; (v) approximations of functionals obtained by conditioning on the population ancestry (the ancestral relations of all gene copies at all times) are broadly similar (for the models considered here) to the approximations obtained without conditioning on the population ancestry.

3
The Behaviour of F-statistics over Time

Li, S.; Wiuf, C.

2022-08-26 genetics 10.1101/2022.08.25.505252 medRxiv
Top 0.1%
23.6%

We study the behaviour of the F2-statistic and Fst-statistic, respectively, over time in a Wright-Fisher model with mutation and migration. We give precise conditions for when the F2-statistic is non-monotonic, that is, increases over time until a certain point and then starts decreasing. We show that even for small population sizes, the two statistics are well approximated by population size scaled expressions.

4
A classification of structured coalescent processes with migration, conditional on the population pedigree

Lessard, S.; Easlick, T.; Wakeley, J.

2026-02-19 evolutionary biology 10.64898/2026.02.18.706396 medRxiv
Top 0.1%
23.0%

Recent analyses of the effects that organismal genealogies or pedigrees of populations have on times to common ancestry for samples of genetic data are extended to cases of population subdivision and migration. Traditional coalescent models marginalize over pedigrees. A finding of a pedigree effect implies that data analysis and interpretation should not be based on the corresponding traditional coalescent model but rather on a coalescent model obtained by conditioning on the pedigree. We apply a straightforward test based on the distribution of pairwise coalescence times to four previously described scenarios of subdivision and migration. These scenarios are defined by the relative magnitudes of four parameters: the number of the local populations or demes, the deme size, the migration fraction, and the probability that migration can occur at all. We find pedigree effects in three scenarios. In two, the effect is weak if the deme size is large. The one scenario without a pedigree effect corresponds to the well known structured-coalescent model. The one scenario with a persistent pedigree effect even in the limit as the deme size tends to infinity involves long periods without gene flow interrupted by pulses of migration. We illustrate our results using simulations and numerical analysis.

5
Beta-coalescents when sample size is large

Eldon, B.; Chetwynd-Diggle, J. A.

2026-01-02 evolutionary biology 10.64898/2025.12.30.697022 medRxiv
Top 0.1%
18.1%

Individual recruitment success, or the offspring number distribution of a given population, is a fundamental element in ecology and evolution. Sweepstakes reproduction refers to a highly skewed individual recruitment success without involving natural selection and may apply to individuals in broadcast spawning populations characterised by Type III survivorship. We consider an extension of the Schweinsberg (2003) model of sweepstakes reproduction for a haploid panmictic population of constant size N; the extension also works as an alternative to the Wright-Fisher model. Our model incorporates an upper bound on the random number of potential offspring (juveniles) produced by a given individual. Depending on how the bound behaves relative to the total population size, we obtain the Kingman (1982a,c,b) coalescent, an incomplete Beta-coalescent, or the (complete) Beta-coalescent of Schweinsberg (2003). We argue that applying such an upper bound is biologically reasonable. Moreover, we estimate the error of the coalescent approximation. The error estimates reveal that convergence can be slow, and small sample size can be sufficient to invalidate convergence, for example if the stated bound is of the form N/ log N. We use simulations to investigate the effect of increasing sample size on the site-frequency spectrum. When the limit is a Beta-coalescent, the site frequency spectrum will be as predicted by the limiting tree even though the full coalescent tree may deviate from the limiting one. When in the domain of attraction of the Kingman coalescent the effect of increasing sample size depends on the effective population size as has been noted in the case of the Wright-Fisher model. Conditioning on the population ancestry (the random ancestral relations of the entire population at all times) may have little effect on the site-frequency spectrum for the models considered here (as evidenced by simulation results).
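The merger rates of the complete Beta-coalescent mentioned above can be written down in closed form, which may help make the object concrete. The sketch below assumes the standard Beta(2 - α, α) parametrisation with 0 < α < 2, where a given group of k out of b lineages merges at rate B(k - α, b - k + α) / B(2 - α, α). It does not implement the incomplete Beta-coalescents or the upper bound on juveniles studied in the preprint, and the function names are illustrative only.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b), for a > 0 and b > 0."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_merger_rate(b: int, k: int, alpha: float) -> float:
    """Rate at which one given group of k out of b lineages merges in the
    complete Beta(2 - alpha, alpha)-coalescent (assumed parametrisation)."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie strictly between 0 and 2")
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    log_rate = log_beta(k - alpha, b - k + alpha) - log_beta(2.0 - alpha, alpha)
    return exp(log_rate)


def total_merger_rate(b: int, alpha: float) -> float:
    """Total rate of leaving the state with b lineages: sum over all
    group sizes k of (number of k-subsets) * (rate per k-subset)."""
    return sum(comb(b, k) * beta_merger_rate(b, k, alpha) for k in range(2, b + 1))


if __name__ == "__main__":
    # alpha = 1 is the Bolthausen-Sznitman case; alpha close to 2
    # approaches Kingman-like pairwise mergers.
    for alpha in (0.5, 1.0, 1.5, 1.99):
        print(alpha, [round(beta_merger_rate(5, k, alpha), 4) for k in range(2, 6)])
```

With two lineages the only possible event is a pairwise merger, so the total rate is 1 for every α in this time scaling. Larger α shifts weight toward small mergers.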

6
Wald's martingale and the Moran process

Monk, T. E.; van Schaik, A.

2020-02-24 evolutionary biology 10.1101/2020.02.24.962407 medRxiv
Top 0.1%
18.0%

Many models of evolution are stochastic processes, where some quantity of interest fluctuates randomly in time. One classic example is the Moran birth-death process, where that quantity is the number of mutants in a population. In such processes we are often interested in their absorption (i.e. fixation) probabilities, and the conditional distributions of absorption time. Those conditional time distributions can be very difficult to calculate, even for relatively simple processes like the Moran birth-death model. Instead of considering the time to absorption, we consider a closely-related quantity: the number of mutant population size changes before absorption. We use Wald's martingale to obtain the conditional characteristic functions of that quantity in the Moran process. Our expressions are novel, analytical, and exact. The parameter dependence of the characteristic functions is explicit, so it is easy to explore their properties in parameter space. We also use them to approximate the conditional characteristic functions of absorption time. We state the conditions under which that approximation is particularly accurate. Martingales are an elegant framework to solve principal problems of evolutionary stochastic processes. They do not require us to evaluate recursion relations, so we can quickly and tractably obtain absorption probabilities and times of evolutionary stochastic processes. Author summary: The Moran process is a probabilistic birth-death model of evolution. A mutant is introduced to an indigenous population, and we randomly choose organisms to live or die on subsequent time steps. Our goals are to calculate the probabilities that the mutant eventually dominates the population or goes extinct, and the distribution of time it requires to do so. The conditional distributions of time are difficult to obtain for the Moran process, so we consider a slightly different but related problem. We instead calculate the conditional distributions of the number of times that the mutant population size changes before it dominates the population or goes extinct. We use a martingale identified by Abraham Wald to obtain elegant and exact expressions for those distributions. We then use them to approximate conditional time distributions, and we show when that approximation is accurate. Our analysis outlines the basic concepts of martingales and demonstrates why they are a formidable tool for studying probabilistic evolutionary models such as the Moran process.
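The quantity studied here, the number of mutant population size changes before absorption, can be explored by simulation. The sketch below is not the authors' martingale derivation. It assumes the textbook Moran process with constant relative mutant fitness r, in which each change of the mutant count is an increase with probability r / (1 + r), and it compares simulated fixation frequencies with the standard closed-form fixation probability. All function names are hypothetical.

```python
import random


def fixation_probability(i: int, n: int, r: float) -> float:
    """Probability that i mutants of relative fitness r take over a Moran
    population of size n (standard closed form; r = 1 is the neutral case)."""
    if r == 1.0:
        return i / n
    return (1.0 - r ** (-i)) / (1.0 - r ** (-n))


def simulate_jump_chain(i: int, n: int, r: float, rng: random.Random):
    """Run the embedded jump chain until absorption at 0 or n.

    Returns (fixed, changes): whether the mutant fixed, and how many times
    the mutant count changed before absorption."""
    up = r / (1.0 + r)  # probability that a change is an increase
    changes = 0
    while 0 < i < n:
        i += 1 if rng.random() < up else -1
        changes += 1
    return i == n, changes


def estimate(i: int, n: int, r: float, runs: int, seed: int = 1):
    """Monte Carlo estimate of the fixation probability and of the mean
    number of changes conditional on fixation."""
    rng = random.Random(seed)
    results = [simulate_jump_chain(i, n, r, rng) for _ in range(runs)]
    fixed_changes = [c for fixed, c in results if fixed]
    p_fix = len(fixed_changes) / runs
    mean_changes = sum(fixed_changes) / len(fixed_changes) if fixed_changes else float("nan")
    return p_fix, mean_changes


if __name__ == "__main__":
    print(fixation_probability(1, 10, 2.0))
    print(estimate(1, 10, 2.0, runs=20_000))
```

The simulation counts jumps only, so it says nothing directly about absorption time in generations. That step needs the waiting times between jumps.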

7
Adaptive dynamics of memory-1 strategies in the repeated donation game

LaPorte, P.; Hilbe, C.; Nowak, M.

2023-03-03 evolutionary biology 10.1101/2023.03.02.530745 medRxiv
Top 0.1%
17.7%

Social interactions often take the form of a social dilemma: collectively, individuals fare best if everybody cooperates, yet each single individual is tempted to free ride. Social dilemmas can be resolved when individuals interact repeatedly. Repetition allows individuals to adopt reciprocal strategies which incentivize cooperation. The most basic model to study reciprocity is the repeated donation game, a variant of the repeated prisoner's dilemma. Two players interact over many rounds, in which they repeatedly decide whether to cooperate or to defect. To make their decisions, they need a strategy that tells them what to do depending on the history of previous play. Memory-1 strategies depend on the previous round only. Even though memory-1 strategies are among the most elementary strategies of reciprocity, their evolutionary dynamics has been difficult to study analytically. As a result, most previous work relies on simulations. Here, we derive and analyze their adaptive dynamics. We show that the four-dimensional space of memory-1 strategies has an invariant three-dimensional subspace, generated by the memory-1 counting strategies. Counting strategies record how many players cooperated in the previous round, without considering who cooperated. We give a partial characterization of adaptive dynamics for memory-1 strategies and a full characterization for memory-1 counting strategies. Author summary: Direct reciprocity is a mechanism for evolution of cooperation based on the repeated interaction of the same players. In the most basic setting, we consider a game between two players and in each round they choose between cooperation and defection. Hence, there are four possible outcomes: (i) both cooperate; (ii) I cooperate, you defect; (iii) I defect, you cooperate; (iv) both defect. A memory-1 strategy for playing this game is characterized by four quantities which specify the probabilities to cooperate in the next round depending on the outcome of the current round. We study evolutionary dynamics in the space of all memory-1 strategies. We assume that mutant strategies are generated in close proximity to the existing strategies, and therefore we can use the framework of adaptive dynamics, which is deterministic.
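A memory-1 strategy as described above is just four cooperation probabilities, one per outcome of the previous round, so the long-run payoff between two such strategies can be sketched with a four-state Markov chain. The code below is an illustration under stated assumptions, not the authors' adaptive-dynamics analysis: it uses plain power iteration from a uniform start (so the result is only meaningful when the chain has a unique stationary distribution), and the names `benefit`, `cost`, and all function names are hypothetical.

```python
# Outcomes are ordered (CC, CD, DC, DD) from player 1's point of view:
# the first letter is player 1's move, the second is player 2's move.
SWAP = (0, 2, 1, 3)  # player 2 sees CD as DC and vice versa


def is_counting_strategy(p) -> bool:
    """A counting strategy reacts only to how many players cooperated,
    so its responses after CD and after DC coincide."""
    return p[1] == p[2]


def long_run_distribution(p, q, steps: int = 10_000):
    """Approximate long-run distribution over (CC, CD, DC, DD) when
    memory-1 strategies p (player 1) and q (player 2) meet."""
    v = [0.25] * 4
    for _ in range(steps):
        nxt = [0.0] * 4
        for state, mass in enumerate(v):
            x = p[state]        # player 1 cooperates with probability x
            y = q[SWAP[state]]  # player 2 cooperates with probability y
            nxt[0] += mass * x * y
            nxt[1] += mass * x * (1.0 - y)
            nxt[2] += mass * (1.0 - x) * y
            nxt[3] += mass * (1.0 - x) * (1.0 - y)
        v = nxt
    return v


def donation_payoff(p, q, benefit: float, cost: float) -> float:
    """Player 1's average payoff per round in the donation game:
    cooperating costs `cost` and gives the co-player `benefit`."""
    v = long_run_distribution(p, q)
    return v[0] * (benefit - cost) - v[1] * cost + v[2] * benefit


if __name__ == "__main__":
    generous = (1.0, 0.3, 1.0, 0.3)  # hypothetical example strategy
    print(donation_payoff(generous, generous, benefit=3.0, cost=1.0))
```

An unconditional cooperator meeting an unconditional defector ends up paying the cost every round, which gives a quick sanity check on the state ordering.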

8
The Age of Selection-Duality Mutation under Fluctuating Selection among Individuals (FSI)

Gu, X.

2026-02-02 evolutionary biology 10.64898/2026.01.30.701161 medRxiv
Top 0.1%
17.0%

Our recent work on molecular evolution and population genetics postulated that individuals with a specific mutation exhibit a fluctuation in fitness, called FSI (fluctuating selection among individuals) for short, whereas the fitness effect of the wildtype remains constant. An intriguing phenomenon called selection-duality emerges: a slightly beneficial mutation could be subject to negative selection (the substitution rate being less than the mutation rate). It appears that selection-duality is delimited by two bounds: the generic neutrality, where the mutation is neutral in terms of mean fitness, and the substitution neutrality, where the substitution rate equals the mutation rate. In addition, the middle point between generic neutrality and substitution neutrality is called the FSI-neutrality. An important problem concerns the age profile of allele frequency, i.e., the time at which a mutation arose given its frequency in the current population (the allele-age problem for short). Solving this problem under selection-duality would help extend the standard coalescent theory, which is based on strict neutrality, to a more general form under selection-duality. In this paper, we studied the allele-age problem under selection-duality by the first arrival time approach and the mean age approach, respectively. Since the general solution of the allele-age problem under selection-duality is not available, we focused on solving the problem at the substitution neutrality (the upper bound of selection-duality), the FSI-neutrality (the middle point) and the generic neutrality (the lower bound), respectively. Our analysis results in an overall picture that the mean first-arrival age of a mutation at the substitution neutrality is theoretically identical to that at the FSI-neutrality, which is numerically close to that at the generic neutrality. For illustration, we calculated the mean age of nonsynonymous mutations in the human population and demonstrated that the estimated allele-age could be overestimated considerably when the effect of FSI was neglected.

9
Monomorphic ESS does not imply the stability of the corresponding polymorphic state in the replicator dynamics in matrix games under time constraints

Varga, T.; Garay, J.

2021-08-06 evolutionary biology 10.1101/2021.08.05.455237 medRxiv
Top 0.1%
14.7%

Matrix games under time constraints are natural extensions of matrix games. They consider the fact that, in addition to the payoff, a pairwise interaction has a further consequence for the contestants. Namely, both players have to wait for some time before becoming fit to participate in a subsequent interaction. Every matrix game can be assigned a continuous dynamical system (the replicator equation) which describes how the frequencies of different phenotypes evolve in the population. One of the fundamental theorems of evolutionary matrix games asserts that the state corresponding to an evolutionarily stable strategy is an asymptotically stable rest point of the replicator equation (Taylor and Yonker 1978, Hofbauer et al. 1979, Zeeman 1980). Garay et al. (2018) and Varga et al. (2020) generalized the statement to two-strategy and, in some particular cases, three- or more strategy matrix games under time constraints. However, the question of whether the implication holds in general remained open. Here examples are provided demonstrating that the answer is no. Moreover, we point out through the rock-scissor-paper game that arbitrarily small differences between waiting times can destabilize the rest point corresponding to an ESS. It is also shown that a stable limit cycle can arise around the unstable rest point in a supercritical Hopf bifurcation. Mathematics Subject Classification: 91A22, 92D15, 92D25, 91A80, 91A05, 91A10, 91A40, 92D40

10
General moment closure for the neutral two-locus Wright-Fisher dynamics

Kundagrami, R.; Yetter, S.; Steinruecken, M.

2026-01-20 genetics 10.64898/2026.01.16.700021 medRxiv
Top 0.1%
14.6%

The Wright-Fisher diffusion and its dual, the coalescent process, are at the core of many results and methods in population genetics. Approaches have been developed to study the dynamics of its moments under genetic drift, mutation, and recombination using ordinary differential equations. The dynamics of these moments can be used to study population genetic processes and are key building blocks of efficient methods to infer population genetic parameters, like demographic histories or fine-scale recombination rates. However, the system of equations does not close under recombination; that is, computing moments of a certain order requires knowledge of moments of higher order. By applying a coordinate transformation to the diffusion generator, we show that the canonical moments in these alternative coordinates yield a closed system, enabling more accurate numerical computations. Compared to previous approaches in the literature, we believe that this approach can be more readily extended to general scenarios. Through simulations, we verify that the derived system of differential equations can accurately capture the dynamics of the moments, and can be used to efficiently compute expected diversity and linkage statistics in population genetic samples.

11
A mathematical synthesis of genetics, development, and evolution

Gonzalez-Forero, M.

2026-02-26 evolutionary biology 10.64898/2026.02.25.707927 medRxiv
Top 0.1%
14.0%

Mathematically integrating genetics, development, and evolution is a longstanding challenge. Here I develop general mathematical theory that integrates sexual, discrete, multilocus genetics, development, and evolution. This yields an exact method to describe the evolutionary dynamics of allele frequencies and linkage disequilibria in multilocus systems and the associated evolutionary dynamics of mean phenotypes constructed via arbitrarily complex developmental processes. The theory shows how development affects evolution under realistic genetics, namely by shaping the fitness landscape of allele frequencies and linkage disequilibria and by constraining adaptation to an admissible evolutionary manifold (high dimensional region on the landscape) where mean phenotypes, phenotype (co-)variances, and higher moments can be developed. I derive a first-order approximation of this exact method, which yields equations in gradient form describing change in allele frequency, linkage disequilibria, and mean phenotypes as constrained, sometimes-adaptive topographies. Both the exact and approximated equations describe long-term phenotypic and genetic evolution, including the evolution of mean phenotypes, phenotype covariance matrices, "mechanistic" additive genetic cross-covariance matrices, and higher moments. I provide worked examples to illustrate the methods. The theory obtained is referred to as evo-devo dynamics, which can be interpreted as an extension of population genetics, with some similarities to quantitative genetics but with fundamental differences. The theory provides tools to re-assess empirical observations that have been paradoxical under previous theory, such as the maintenance of genetic variation, the paradox of stasis, the paradox of predictability, and the rarity of stabilising selection, which appear less paradoxical in this theory.

12
Numerical simulation of the two-locus Wright-Fisher stochastic differential equation with application to approximating transition probability densities

He, Z.; Beaumont, M. A.; Yu, F.

2020-07-21 genetics 10.1101/2020.07.21.213769 medRxiv
Top 0.1%
12.2%

Over the past decade there has been an increasing focus on the application of the Wright-Fisher diffusion to the inference of natural selection from genetic time series. A key ingredient for modelling the trajectory of gene frequencies through the Wright-Fisher diffusion is its transition probability density function. Recent advances in DNA sequencing techniques have made it possible to monitor genomes in great detail over time, which presents opportunities for investigating natural selection while accounting for genetic recombination and local linkage. However, most existing methods for computing the transition probability density function of the Wright-Fisher diffusion are only applicable to one-locus problems. To address two-locus problems, in this work we propose a novel numerical scheme for the Wright-Fisher stochastic differential equation of population dynamics under natural selection at two linked loci. Our key innovation is that we reformulate the stochastic differential equation in a closed form that is amenable to simulation, which enables us to avoid boundary issues and reduce computational costs. We also propose an adaptive importance sampling approach based on the proposal introduced by Fearnhead (2008) for computing the transition probability density of the Wright-Fisher diffusion between any two observed states. We show through extensive simulation studies that our approach can achieve comparable performance to the method of Fearnhead (2008) but can avoid manually tuning the parameter ρ to deliver superior performance for different observed states.

13
Local adaptation in a metapopulation - a multi-habitat perspective

Olusanya, O.; Barton, N. H.; Polechova, J.

2025-05-07 evolutionary biology 10.1101/2025.05.03.652039 medRxiv
Top 0.1%
12.0%

This study extends existing soft selection models of local adaptation in metapopulations from two habitats to a multi-habitat scenario, where each habitat exerts unique selection pressures. Specifically, we examine a three-habitat multilocus model in which each allele is favored in habitat 1, disfavored in habitat 3, and the selection pressure in the intermediate habitat may be different across loci. Employing the diffusion and fixed state approximations under the assumption of linkage equilibrium, we investigate conditions for the persistence of a polymorphism. We derive analytical thresholds for such persistence, which reveal scaling for the model parameters, local deme size (N), migration rate (m), selection pressure (si) and the proportion, (i) of each habitat. We show that under the assumption of infinitely many islands and selective neutrality in the intermediate habitat, the size of the intermediate habitat does not affect the maintenance of polymorphism. With symmetric selection pressure (s1=s3=s) in habitats 1 and 3, the system can be fully characterized by the product Ns, the product Nm, and a parameter β, defined as the ratio of the size of habitat 1 (favoring the allele) to habitat 3 (where the allele is disfavored). We find that the range of polymorphism widens as gene flow between demes decreases and the symmetry of habitats increases (β approaches 1). In the final section, we explore the effect of drift on the critical migration threshold as well as the effect of symmetry between selection. We demonstrate that genetic drift considerably lowers the critical migration threshold required for the maintenance of polymorphism. Furthermore, when each island is small but there are (infinitely) many of them, relatively low levels of gene flow can have a large impact in preventing genetic differentiation in a fragmented population.

14
Mathematical constraints on FST: multiallelic markers in arbitrarily many populations

Alcala, N.; Rosenberg, N. A.

2021-07-25 genetics 10.1101/2021.07.23.453474 medRxiv
Top 0.1%
10.5%

Interpretations of values of the FST measure of genetic differentiation rely on an understanding of its mathematical constraints. Previously, it has been shown that FST values computed from a biallelic locus in a set of multiple populations and FST values computed from a multiallelic locus in a pair of populations are mathematically constrained as a function of the frequency of the allele that is most frequent across populations. We generalize from these cases to report here the mathematical constraint on FST given the frequency M of the most frequent allele at a multiallelic locus in a set of multiple populations. Using coalescent simulations of an island model of migration with an infinitely-many-alleles mutation model, we argue that the joint distribution of FST and M helps in disentangling the separate influences of mutation and migration on FST. Finally, we show that our results explain a puzzling pattern of microsatellite differentiation: the lower FST in an interspecific comparison between humans and chimpanzees than in the comparison of chimpanzee populations. We discuss the implications of our results for the use of FST.
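The two quantities related in this abstract, FST and the frequency M of the most frequent allele, are easy to compute from a table of allele frequencies. The sketch below assumes one common definition, FST = (HT - HS) / HT with equally weighted populations, where HS is the mean within-population heterozygosity and HT the heterozygosity of the pooled mean frequencies. Whether this matches the estimator used in the preprint is an assumption, and the code does not compute the mathematical bound itself.

```python
def fst_and_m(freqs):
    """Compute (FST, M) from per-population allele frequencies.

    freqs: list of populations, each a list of allele frequencies that
    sums to 1; all populations list the same alleles in the same order.
    Populations are weighted equally (an assumption of this sketch)."""
    k = len(freqs)
    n_alleles = len(freqs[0])
    # Mean frequency of each allele across populations.
    mean = [sum(pop[a] for pop in freqs) / k for a in range(n_alleles)]
    # Mean within-population heterozygosity.
    h_s = sum(1.0 - sum(x * x for x in pop) for pop in freqs) / k
    # Heterozygosity of the pooled (mean) frequencies.
    h_t = 1.0 - sum(x * x for x in mean)
    if h_t == 0.0:
        raise ValueError("FST is undefined when a single allele is fixed everywhere")
    return (h_t - h_s) / h_t, max(mean)


if __name__ == "__main__":
    # Three populations, three alleles (made-up frequencies).
    table = [
        [0.7, 0.2, 0.1],
        [0.5, 0.4, 0.1],
        [0.2, 0.3, 0.5],
    ]
    print(fst_and_m(table))
```

Populations fixed for different alleles give FST = 1, and identical populations give FST = 0, which brackets the values seen on real data.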

15
Multivariate Trait Evolution: Models for the Evolution of the Quantitative Genetic G-Matrix on Phylogenies.

Blomberg, S. P.; Muniz, M.; Bui, M. N.; Janke, C.

2025-05-21 evolutionary biology 10.1101/2024.10.26.620394 medRxiv
Top 0.1%
10.2%

Genetic covariance matrices (G-matrices) are a key focus for research and predictions from quantitative genetic evolutionary models of multiple traits. There is a consensus among quantitative geneticists that the G-matrix can evolve through deep time. Yet, quantitative genetic models for the evolution of the G-matrix are conspicuously lacking. In contrast, the field of macroevolution has several stochastic models for univariate traits evolving on phylogenies. However, despite much research into multivariate phylogenetic comparative methods, analytical models of how multivariate trait matrices might evolve on phylogenies have not been considered. Here we show how three analytical models for the evolution of matrices and multivariate traits on phylogenies, based on Lie group theory, Riemannian geometry and stochastic differential (diffusion) equations, can be combined to unify quantitative genetics and macroevolutionary theory in a coherent mathematical framework. The models provide a basis for understanding how G-matrices might evolve on phylogenies, and we show how to fit models to data via simulation using Approximate Bayesian Computation. Such models can be used to generate and test hypotheses about the evolution of genetic variances and covariances, together with the evolution of the traits themselves, and how these might vary across a phylogeny. This unification of macroevolutionary theory and quantitative genetics is an important advance in the study of phenotypes, allowing for the construction of a synthetic quantitative theory of the evolution of species and multivariate traits over deep time. Lay Summary: We unite Quantitative Genetics, the major mathematical theory of multivariate quantitative trait microevolution, with the mathematical theory of multivariate macroevolution. To do this, we allow the key component of quantitative genetic theory, the matrix of additive genetic variances and covariances (the G-matrix) to evolve along evolutionary trees. This is an advance because the G-matrix is assumed to be constant in quantitative genetics (for convenience), but it has been recognised that it evolves on macroevolutionary timescales (in deep time). Uniting Quantitative Genetics with macroevolutionary theory allows for a more complete mathematical description of Darwin's theory of evolution, and allows for further testing of evolutionary hypotheses.

16
Estimating disease spread using structured coalescent and birth-death models: A quantitative comparison

Seidel, S.; Stadler, T.; Vaughan, T.

2020-11-30 bioinformatics 10.1101/2020.11.30.403741 medRxiv
Top 0.1%
10.0%

Understanding how disease transmission occurs between subpopulations is critically important for guiding disease control efforts irrespective of whether the subpopulations represent geographically separated people, age or risk groups. The structured coalescent (SC) and the multitype birth-death (MBD) model can both be used to infer migration rates between subpopulations from phylogenies reconstructed from pathogen genetic sequences. However, the two classes of phylodynamic methods rely on different assumptions. Here, we report on a simulation study which compares inferences made using these models for a variety of migration rates in both endemic diseases and epidemic outbreaks. For the epidemic outbreak, we found that the MBD recovers the true migration rates better than the SC regardless of migration rate. We hypothesize that the inaccurate SC estimates stem from its assumption of a constant population size. For the endemic scenario, our analysis shows that both models obtain a similar coverage of the migration rates, while the SC provides slightly narrower posterior intervals. Irrespective of the scenario, both models estimate the root location with similar coverage. Our study provides concrete modelling advice for infectious disease analysts. For endemic disease either model can be used, while for epidemic outbreaks the MBD should be the model of choice. Additionally, our study reveals the need to develop the SC further such that varying population sizes can easily be taken into account. Author summary: Controlling an infectious disease requires us to quantify and understand how it spreads through pools of susceptible individuals, defined by their belonging to different geographical regions, age or risk groups. Rates of pathogen movement between these pools can be inferred from pathogen phylogenies which are themselves reconstructed from pathogen genetic sequences collected from infected individuals. Two popular foundations for such models are the multitype birth-death model and the structured coalescent. Although these models fulfill the same purpose, they differ in their assumptions and can, hence, produce contrasting results. To assess the appropriateness of the models in different situations, we performed a simulation study. We find that, for endemic diseases, both models are able to estimate the migration parameters reliably. For epidemic outbreaks, however, the multitype birth-death model obtains better estimates of the migration rates. We hypothesize that the structured coalescent's inaccurate estimates for the epidemic scenario arise because it assumes a constant number of infected individuals through time.

17
The extended Price equation: migration and sex

Brown, J.; Field, J. M.

2021-09-30 evolutionary biology 10.1101/2021.09.28.462138 medRxiv
Top 0.1%
9.9%

The Price equation provides a general partition of evolutionary change into two components. The first is usually thought to represent natural selection and the second, transmission bias. Here, we provide a new derivation of the generalised equation, which contains a largely ignored third term. Unlike the original Price equation, this extension can account for migration and mixed asexual and sexual reproduction. The notation used here expresses the generalised equation explicitly in terms of fitness, rendering this otherwise difficult third term more open to biological interpretation and use. This re-derivation also permits fundamental results, derived from the Price equation, to be more easily generalised. We take Hamilton's rule as a case study, and provide an exact, total expression that allows for population structures like haplodiploidy. Our analysis, more generally, makes clear the previously hidden assumptions in similar fundamental results, highlighting the caution that must be taken when interpreting them.
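
For orientation, the original two-term Price equation that this abstract extends is commonly written as below. The notation is the standard textbook form and is an assumption here, not taken from the preprint: $z_i$ is the trait value of individual (or type) $i$, $w_i$ its fitness, $\bar{w}$ the mean fitness, and $\Delta z_i$ the change in trait value between parent and offspring. The preprint's third term is not given in the abstract and is not reproduced.

```latex
\Delta \bar{z}
  = \underbrace{\frac{\operatorname{Cov}(w_i, z_i)}{\bar{w}}}_{\text{selection}}
  + \underbrace{\frac{\operatorname{E}(w_i \,\Delta z_i)}{\bar{w}}}_{\text{transmission bias}}
```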

18
Survival of the frequent at finite population size and mutation rate: bridging the gap between quasispecies and monomorphic regimes with a simple model

Khatri, B. S.

2020-02-04 evolutionary biology 10.1101/375147 medRxiv
Top 0.1%
9.9%

In recent years, there has been increased attention on the non-trivial role that genotype-phenotype maps play in the course of evolution, where natural selection acts on phenotypes, but variation arises at the level of mutations. Understanding such mappings is arguably the next missing piece in a fully predictive theory of evolution. Although there are theoretical descriptions of such mappings for the monomorphic (Nμ ≪ 1) and deterministic or very strong mutation (Nμ ⋙ 1) limits, given by developments of Iwasa's free fitness and quasispecies theories, respectively, there is no general description for the intermediate regime where Nμ ~ 1. In this paper, we address this by transforming Wright's well-known stationary distribution of genotypes under selection and mutation to give the probability distribution of phenotypes, assuming a general genotype-phenotype map. The resultant distribution shows that the degeneracies of each phenotype appear by weighting the mutation term; this gives rise to a bias towards phenotypes of larger degeneracy analogous to quasispecies theory, but at finite population size. On the other hand, we show that as population size is decreased, phenotypes of higher degeneracy are again favoured, which is a finite mutation description of the effect of sequence entropy in the monomorphic limit. We also, for the first time (to the author's knowledge), provide an explicit derivation of Wright's stationary distribution of the frequencies of multiple alleles.

19
Taming Strong Selection with Large Sample Sizes

Gravel, S.; Krukov, I.

2021-03-30 genetics 10.1101/2021.03.30.437711 medRxiv
Top 0.1%
9.2%

The fate of mutations and the genetic load of populations depend on the relative importance of genetic drift and natural selection. In addition, the accuracy of numerical models of evolution depends on the strength of both selection and drift: strong selection breaks the assumptions of the nearly neutral model, and drift coupled with large sample sizes breaks Kingman's coalescent model. Thus, the regime with strong selection and large sample sizes, relevant to the study of pathogenic variation, appears particularly daunting. Surprisingly, we find that the interplay of drift and selection in that regime can be used to define asymptotically closed recursions for the distribution of allele frequencies that are accurate well beyond the strong selection limit. Selection becomes more analytically tractable when the sample size n is larger than twice the population-scaled selection coefficient: n ≥ 2Ns (4Ns in diploids), that is, when the expected number of coalescent events in the sample is larger than the number of selective events. We construct the relevant transition matrices, show how they can be used to accurately compute distributions of allele frequencies, and show that the distribution of deleterious allele frequencies is sensitive to details of the evolutionary model.

20
Haldane’s Probability of Mutant Survival is Not the Probability of Allele Establishment

Krukov, I.; de Koning, A. P. J.

2019-07-16 genetics 10.1101/704577 medRxiv
Top 0.1%
9.1%

Haldane notably showed in 1927 that the probability of fixation for an advantageous allele is approximately 2s, for selective advantage s. This widely known result is variously interpreted as either the fixation probability or the establishment probability, where the latter is considered the likelihood that an allele will survive long enough to have effectively escaped loss by drift. While Haldane was concerned with escape from loss by drift in the same paper, in this short note we point out that: 1) Haldane's probability of survival is analogous to the probability of fixation in a Wright-Fisher model (as also shown by others); and 2) this result is unrelated to Haldane's consideration of how common an allele must be to probably spread through the species. We speculate that Haldane's survival probability may have become misunderstood over time due to a conflation of terminology about surviving drift and ultimately surviving (i.e., fixing). Indeed, we find that the probability of establishment remarkably appears to have been overlooked all these years, perhaps as a consequence of this misunderstanding. Using straightforward diffusion and Markov chain methods, we show that under Haldane's assumptions, where establishment is defined by eventual fixation being more likely than extinction, the establishment probability is actually 4s when the fixation probability is 2s. Generalizing consideration to deleterious, neutral, and adaptive alleles in finite populations, if establishment is defined by the odds ratio between eventual fixation and extinction, k, the general establishment probability is (1 + k)/k times the fixation probability. It is therefore 4s when k = 1, or 3s when k = 2 for beneficial alleles in large populations. As k is made large, establishment becomes indistinguishable from fixation, and ceases to be a useful concept.
As a result, we recommend establishment be generally defined as when the odds of ultimate fixation are greater than for extinction (k = 1, following Haldane), or when fixation is twice as likely as extinction (k = 2).
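
The arithmetic in this abstract can be checked with a minimal sketch. It assumes only the relation stated above (establishment probability equals (1 + k)/k times the fixation probability) and Haldane's approximation of 2s for the fixation probability. The function name and the example value of s are hypothetical, chosen for illustration, and do not come from the preprint.

```python
def establishment_probability(fixation_probability: float, k: float) -> float:
    """Establishment probability from the relation quoted in the abstract.

    k is the odds ratio between eventual fixation and extinction that
    defines establishment; it must be positive.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return (1 + k) / k * fixation_probability


if __name__ == "__main__":
    s = 0.01            # illustrative selective advantage (assumed value)
    p_fix = 2 * s       # Haldane's approximation for a beneficial allele
    for k in (1, 2, 100):
        p_est = establishment_probability(p_fix, k)
        # Report the result as a multiple of s for comparison with the text
        print(f"k = {k}: establishment probability = {p_est / s:.2f} s")
```

With k = 1 the multiplier is 2, giving 4s, and with k = 2 it is 1.5, giving 3s. As k grows, the multiplier approaches 1, which is why establishment becomes indistinguishable from fixation.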